Building a Hierarchy of Events and Topics for Newspaper Digital Libraries
Abstract
In this paper we propose an incremental hierarchical clustering algorithm for on-line event detection. This algorithm is applied to a set of newspaper articles in order to discover the structure of topics and events that they describe. In the first level, articles with a high temporal-semantic similarity are clustered together into events. In the next levels of the hierarchy, these events are successively clustered so that composite events and topics can be discovered. The results obtained for the F1-measure and the Detection Cost demonstrate the validity of our algorithm for on-line event detection tasks.
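The two-level idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define the temporal-semantic similarity, so the `similarity` function below (cosine similarity over term counts, set to zero when two dates are more than `max_gap_days` apart), the thresholds, and the toy articles are all assumptions made for illustration. The sketch runs a single-pass incremental clustering over articles to form events, then runs the same procedure over those events with looser settings to form topics.

```python
import math
from collections import Counter


def make_item(label, text, day):
    """Wrap an article as a cluster-like item (hypothetical representation)."""
    return {"terms": Counter(text.lower().split()), "day": float(day),
            "size": 1, "members": [label]}


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def similarity(x, y, max_gap_days):
    """Assumed temporal-semantic similarity: cosine, zeroed for distant dates."""
    if abs(x["day"] - y["day"]) > max_gap_days:
        return 0.0
    return cosine(x["terms"], y["terms"])


def incremental_cluster(items, threshold, max_gap_days):
    """Single pass: join the most similar cluster, or start a new one."""
    clusters = []
    for item in items:
        best, best_sim = None, threshold
        for c in clusters:
            s = similarity(c, item, max_gap_days)
            if s >= best_sim:
                best, best_sim = c, s
        if best is None:
            # Copy so that lower-level items are never mutated.
            clusters.append({"terms": Counter(item["terms"]), "day": item["day"],
                             "size": item["size"], "members": list(item["members"])})
        else:
            total = best["size"] + item["size"]
            # Size-weighted mean date of the merged cluster.
            best["day"] = (best["day"] * best["size"]
                           + item["day"] * item["size"]) / total
            best["terms"].update(item["terms"])  # Counter.update adds counts
            best["size"] = total
            best["members"].extend(item["members"])
    return clusters


# Toy articles in arrival order (invented for illustration).
articles = [
    make_item("a1", "earthquake hits city rescue teams", 1),
    make_item("a2", "earthquake rescue teams search city", 2),
    make_item("a3", "election results announced parliament", 3),
    make_item("a4", "earthquake city rebuilding aid", 20),
]

# Level 1: tight threshold and short time window yield events.
events = incremental_cluster(articles, threshold=0.5, max_gap_days=7)
# Level 2: looser threshold and longer window group events into topics.
topics = incremental_cluster(events, threshold=0.3, max_gap_days=30)

print([e["members"] for e in events])
print([t["members"] for t in topics])
```

In this toy run, the late rebuilding article forms its own event at the first level because of the time gap, and joins the earlier earthquake event at the second level, where the time window is wider and the threshold lower.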
Similar resources
Proposed content framework for digital literacy education to users in Iran
Aim: Today, digital literacy, a set of skills that enables people to use digital space effectively for success in personal, educational and professional life, has become a necessity in all societies, and public libraries are among the most important providers of digital literacy education in the world. Digital literacy education has not been considered in public libraries in Iran. The first s...
Temporal-Semantic Clustering of Newspaper Articles for Event Detection
In this paper we introduce a new clustering algorithm for event detection in newspaper articles, which has two main features. Firstly, it makes use of the temporal references extracted from the document texts to define the document similarity function. Secondly, the algorithm works hierarchically. In the first level, documents with a high temporal-semantic similarity are grouped into individual...
Indicators for the Design and Evaluation of Digital Libraries
Introduction: The concept and framework of digital libraries have always been ambiguous. Terms such as electronic library, virtual library, library without walls, hybrid library and digital library are often used together or interchangeably to convey the library concept. Studies have shown that there is so far no standard, universally accepted definition of digital libraries, howe...
Assessing the Structural Vulnerability of Public Libraries to Earthquake Hazard Using the Yager Method: A Case Study of Public Libraries in the Central Fabric of Tabriz
Purpose: This research aims to identify the important factors of earthquake vulnerability in the buildings of Tabriz public libraries located in the relatively historic fabric of the city, and to rank them based on the identified and analyzed factors. Methodology: This is an applied, descriptive study. The research data were gathered through scientific documents, checklist, field...
Building a Digital Library of Newspaper Clippings: The Laurin Project
The field of digital libraries has attracted considerable research effort in recent years. Many interesting projects have been started, dealing with the various open issues arising in the field. However, no project has specifically taken into account the problem of building a digital library of newspaper clippings. It is well known that a huge part of cultural knowledge is stored in th...